One-time pad booster for Internet 
One-time pad encrypted files can be sent through Internet channels using current Internet protocols. However, the need to renew shared secret keys makes this method impractical. This work shows how users can use a fast physical random generator based on fluctuations of a light field and the Internet channel to directly boost key renewals. The transmitted signals are deterministic but carry imprinted noise that cannot be eliminated by the attacker. Thus, a one-time pad for the Internet can be made practical. Security is achieved without third parties and without relying on the difficulty of factoring numbers into primes. An informational fragility to be avoided is discussed. An information-theoretic analysis is presented and bounds for secure operation are determined.
Unconditionally secure one-time pad encryption [1] has not found wide applicability in modern communications. The difficulty for users of sharing long streams of secret keys beforehand has been an insurmountable barrier preventing widespread use of one-time pad systems. Even beginning with a start sequence of shared secret keys, no amplification method to obtain new key sequences, or key "refreshing", is available. This work proposes a practical solution to this problem and discusses its own limitations.

Assume that (statistical) physical noise $n = n_1, n_2, \ldots$ has been added to a message bit sequence $X = x_1, x_2, \ldots$ according to some rule $f_j(x_j, n_j)$, giving $Y = f_1(x_1, n_1), f_2(x_2, n_2), \ldots$ (Whenever binary physical signals are implied, $f_j(x_j, n_j)$ will represent $f_j = \oplus$ (addition mod 2). When analog physical signals are made discrete by analog-to-digital converters, a sum of a binary signal onto a discrete set will be assumed.) The addition process is performed at the emitter station and $Y$ becomes a binary file carrying the recorded noise. $Y$ is sent from user A to user B (or from B to A) through an insecure channel. The amount of noise is assumed to be high, such that without any knowledge beyond $Y$, neither B (or A) nor an attacker E could extract the sequence $X$ with a probability $P$ better than the guessing level $P = (1/2)^N$, where $N$ is the number of bits.

Assuming that A and B share some knowledge beforehand, the amount of information held by A (or B) differs from that held by E. Can this information asymmetry be used by A and B to share secure information over the Internet? It will be shown that if A and B start sharing a secret key sequence $K_0$ they may end up with a practical new key sequence $K \gg K_0$. The security of this new sequence is discussed, including an avoidable fragility to an a-posteriori known-plaintext attack. Within bounds to be demonstrated, this makes one-time pad encryption practical for fast Internet communications (data, image or sound). It should be emphasized that being practical does not imply that $K_0$ or the new keys can be opened to the attacker after transmission. These keys have to be kept secret as long as encrypted messages have to be protected, as in a strict one-time pad. The system gives users A and B direct control to guarantee secure communication without use of third parties or certificates. Some may think of the method as an extra protective layer on top of the current Internet encryption protocols. The system operates on top of all IP layers and does not disturb current protocols in use by Internet providers. In any case, one should emphasize that the proposed method relies on security created by physical noise and not just on mathematical complexities such as the difficulty of factoring numbers into primes. This way, its security level does not depend on advances in algorithms or computation.

Random events of physical origin cannot be deterministically predicted and are sometimes classified as classical or quantum events. Some take the point of view that a recorded classical random event is just the record of a single realization among all the possible quantum trajectories [2]. These classifications are of a philosophical nature and are not relevant to the practical aspects to be discussed here. However, what should be emphasized is that physical noise is completely different from pseudo-noise generated by a deterministic process (e.g., hardware stream ciphers) because, despite any complexity introduced, the deterministic generation mechanism can be searched for, eventually discovered and used by the attacker.

Before introducing the communication protocol to be used, one should discuss the superposition of physical signals onto deterministic binary signals. Any signal transmitted over the Internet is physically prepared to be compatible with the channel being used. This way, e.g., voltage levels $V_0$ and $V_1$ in a computer may represent bits. These
values may be understood as the simple encoding

$$V^{(0)}: \quad V_0 \to \text{bit } 0, \qquad V_1 \to \text{bit } 1 . \qquad (1)$$

Technical noise, e.g. electrical noise, in the bit levels $V_0$ and $V_1$ is assumed to be low. Channel noise is also assumed to be modest. Errors caused by these noise sources are assumed to be correctable by classical error-correction codes. In any case, the end user is supposed to receive the bit sequence $X$ (prepared as a sequence of $V_0$ and $V_1$) as determined by the sender. If one of these deterministic binary signals $x_j$ is repeated over the channel, e.g. $x_1 = x$ and $x_2 = x$, one has the known property $x_1 \oplus x_2 = 0$. This property has to be compared to cases where a non-negligible amount of physical noise $n_j$ (in analog or discrete form) has been added to each emission. Writing $y_1 = f_1(x_1, n_1) = f_1(x, n_1)$ and $y_2 = f_2(x_2, n_2) = f_2(x, n_2)$, one has $f(y_1, y_2) =$ neither 0 nor 1 in general. This difference from the former case, where $x_1 \oplus x_2 = 0$, emphasizes the uncontrollable effect of the noise.
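
The contrast between a repeated clean bit and a repeated noisy one can be illustrated with a minimal sketch. The Gaussian noise model, the value of `sigma` and the helper name `emit` are assumptions made only for this illustration, and a seeded software generator stands in for the physical noise source.

```python
import random

def emit(x, sigma, rng):
    """Hypothetical emission rule f(x, n): bit level x plus analog noise n."""
    return x + rng.gauss(0.0, sigma)

rng = random.Random(1)   # software stand-in for a physical noise source
x = 1                    # the same bit is sent twice

# Noiseless repetition: the two copies cancel exactly under addition mod 2.
clean_xor = x ^ x

# Noisy repetition: each recorded value carries its own noise sample,
# so combining the two copies no longer yields a clean 0 or 1.
y1 = emit(x, 0.3, rng)
y2 = emit(x, 0.3, rng)
residual = y1 - y2

print(clean_xor, residual)
```

The residual is an analog value set by the two independent noise samples, which is the sense in which the noisy case is "neither 0 nor 1".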

The encoding shown above allows binary values $V_0$ and $V_1$ to represent bits 0 and 1, respectively. These values are assumed to be determined without ambiguity. Instead of this unique encoding, consider that two distinct encodings can be used to represent bits 0 and 1: either $V^{(0)}$, over which $x_0^{(0)}$ and $x_1^{(0)}$ represent the two bits 0 and 1 respectively, or $V^{(1)}$, over which $x_1^{(1)} = x_0^{(0)} + \epsilon$ and $x_0^{(1)} = x_1^{(0)} + \epsilon$ ($\epsilon \ll 1$) represent the two bits 1 and 0 (in a different order from the former assignment). These encodings represent physical signals such as, for example, phase signals.

Assume noiseless transmission signals but where noise $n_j$ has been introduced or added to each $j$-th bit sent (this is equivalent to noiseless signals in a noisy channel). Consider that the user does not know which encoding, $V^{(0)}$ or $V^{(1)}$, was used. With a noise level $n_j$ superposed on signals in $V^{(0)}$ or $V^{(1)}$, and if $|x_0^{(0)} - x_1^{(0)}| \gg n_j \gg \epsilon$, one cannot distinguish between signals 0 and 1 in $V^{(0)}$ and $V^{(1)} = V^{(0)} + \epsilon$, but one knows easily that a signal belongs either to the set (0 in $V^{(0)}$ or 1 in $V^{(1)}$) or to the set (1 in $V^{(0)}$ or 0 in $V^{(1)}$). Also note that once the encoding used is known, there is no need to distinguish between $x_j$ and $x_j + \epsilon$. In this case, it is straightforward to determine a bit 0 or 1 because values in a single encoding are widely separated and, therefore, distinguishable. One may say that without information on the encoding used, the bit values cannot be determined.
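
A small simulation may make the role of the two encodings concrete. It is a sketch under stated assumptions: unit separation between the two levels of one encoding, Gaussian noise, and the values `EPS` and `SIGMA` are all illustrative choices, and the names `level` and `run` are hypothetical.

```python
import random

EPS = 0.001     # assumed spacing between the two encodings
SIGMA = 0.05    # assumed noise level, chosen so that 1 >> SIGMA >> EPS

def level(bit, enc):
    """Nominal level: encoding 1 is encoding 0 shifted by EPS with the bits swapped."""
    return float(bit ^ enc) + enc * EPS

def run(trials, rng):
    set_ok = bit_ok = enc_guess_ok = 0
    for _ in range(trials):
        bit, enc = rng.getrandbits(1), rng.getrandbits(1)
        y = level(bit, enc) + rng.gauss(0.0, SIGMA)
        which_set = 1 if y > 0.5 else 0            # visible to everybody
        set_ok += which_set == (bit ^ enc)
        bit_ok += (which_set ^ enc) == bit         # requires the encoding
        guess = 1 if (y - which_set) > EPS / 2 else 0   # attacker guessing the encoding
        enc_guess_ok += guess == enc
    return set_ok / trials, bit_ok / trials, enc_guess_ok / trials

set_rate, bit_rate, enc_rate = run(20000, random.Random(3))
print(set_rate, bit_rate, enc_rate)
```

In this toy model everybody identifies the set, the holder of the encoding recovers the bit, and a party guessing the encoding from the signal alone does only marginally better than a coin toss.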

Physical noise processes will be detailed ahead, but this indistinguishability of the signals without basis information is what lets A and B share random bits over the Internet in a secure way. Physical noise has been used before in fiber-optics based systems using $M$-ry levels [3] to protect information ($\alpha\eta$ systems). However, the system proposed here is completely distinct from those $\alpha\eta$ systems and is related to the key distribution system
presented in 

A brief description of the protocol steps will be given before a theoretical security analysis is shown and the system's limitations are discussed. It was said that if A and B start sharing a secret key sequence $K_0$ beforehand they may end up with a secure fresh key sequence $K$ much longer than $K_0$ ($K \gg K_0$). Assume that $K_0$ gives encoding information, that is to say, which encoding ($V^{(0)}$ or $V^{(1)}$) is being used at the $j$-th emission. Assume that $K_0 = k_1^{(0)}, k_2^{(0)}, \ldots$ has a length $K_0$ and that the user A has a physical random generator PhRG able to generate random bits and noise in continuous levels. A generates a random sequence $K_1 = k_1^{(1)}, k_2^{(1)}, \ldots, k_{K_0}^{(1)}$ (say, binary voltage levels) and a sequence of $K_0$ noisy signals $n$ (e.g., voltage levels in a continuum). The deterministic signal (carrying recorded noise) $Y_1 = k_1^{(0)} \oplus f_1(k_1^{(1)}, n_1^{(1)}), k_2^{(0)} \oplus f_2(k_2^{(1)}, n_2^{(1)}), \ldots$ is then sent to B. Is B able to extract the fresh sequence $K_1$ from $Y_1$? B applies $Y_1 \oplus K_0 = f_1(k_1^{(1)}, n_1^{(1)}), f_2(k_2^{(1)}, n_2^{(1)}), \ldots, f_N(k_N^{(1)}, n_N^{(1)})$. As B knows the encoding used and the signals representing bits 0 or 1 in a given encoding are easily identifiable, $f_1(k_1^{(1)}, n_1^{(1)}) \to k_1^{(1)}$, $f_2(k_2^{(1)}, n_2^{(1)}) \to k_2^{(1)}$, $\ldots$, $f_N(k_N^{(1)}, n_N^{(1)}) \to k_N^{(1)}$. B then obtains the new random sequence $K_1$ generated by A.
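
The first transfer from A to B can be sketched as follows. This is only one possible reading of the rule $k_j^{(0)} \oplus f_j(k_j^{(1)}, n_j^{(1)})$: the signal is modelled as a phase, the basis bit selects between two encodings separated by `eps`, and Gaussian noise of width `sigma` is added. The function names, the parameter values and the use of a seeded software generator in place of the PhRG are assumptions for illustration.

```python
import math
import random

def encode(fresh_bits, basis_bits, eps, sigma, rng):
    """Hypothetical f_j: the basis bit selects the encoding, then noise n_j is added."""
    return [math.pi * (k ^ b) + eps * b + rng.gauss(0.0, sigma)
            for k, b in zip(fresh_bits, basis_bits)]

def decode(signals, basis_bits):
    """Receiver who knows the basis: decide near 0 or near pi, then undo the basis."""
    return [(1 if math.cos(y) < 0.0 else 0) ^ b
            for y, b in zip(signals, basis_bits)]

rng = random.Random(7)                       # stand-in for the PhRG, not secure
n = 1000
K0 = [rng.getrandbits(1) for _ in range(n)]  # shared beforehand
K1 = [rng.getrandbits(1) for _ in range(n)]  # fresh bits generated by A

Y1 = encode(K1, K0, eps=2 * math.pi / 2**12, sigma=0.1, rng=rng)  # sent openly
recovered = decode(Y1, K0)                   # B's side
print(recovered == K1)
```

A pseudo-random generator is used here only so that the sketch is reproducible; the argument of the text relies on physical noise that a software generator cannot provide.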

Is the attacker also able to extract the same sequence $K_1$? Actually, this was a one-time pad with $K_0$ plus added noise and, therefore, it is known that the attacker cannot obtain $K_1$. The security problem arises for further exchanges of random bits, e.g. if B wants to share further secret bits with A.

Assume that B also has a physical random generator PhRG able to generate random bits and noise in continuous levels. B wants to send, in a secure way, a freshly generated key sequence $K_2 = k_1^{(2)}, k_2^{(2)}, \ldots, k_{K_0}^{(2)}$ from his PhRG to A. B records the signals $Y_2 = k_1^{(1)} \oplus f_1(k_1^{(2)}, n_1^{(2)}), k_2^{(1)} \oplus f_2(k_2^{(2)}, n_2^{(2)}), \ldots$ and sends them to A. As A knows $K_1$, he (or she) applies $Y_2 \oplus K_1$ and extracts $K_2$. A and B now share the two new sequences $K_1$ and $K_2$. To speed up communication, even a simple rounding to the nearest integer would produce a binary output for the operation $f_j(k_j, n_j)$. The security of this process will be shown ahead.

The simple description presented shows a key distribution from A to B and from B to A, with the net result that A and B share the fresh sequences $K_1$ and $K_2$. These steps can be seen as a first distribution cycle. A could again send another fresh sequence $K_3$ to B, and so on. This repeated procedure provides A and B with sequences $K_1, K_2, K_3, K_4, \ldots$ This is the basic key distribution protocol for the system.
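
The chaining of cycles, in which each freshly shared sequence becomes the basis key for the next emission, might look like the sketch below. The phase model, the helper names `send` and `receive`, and all numerical values are assumptions; key reconciliation, privacy amplification and the resulting shortening of the sequences are deliberately left out.

```python
import math
import random

def send(fresh, key, rng, eps=2 * math.pi / 2**12, sigma=0.1):
    """Sender side: hypothetical noisy two-basis phase encoding."""
    return [math.pi * (k ^ b) + eps * b + rng.gauss(0.0, sigma)
            for k, b in zip(fresh, key)]

def receive(signals, key):
    """Receiver side: threshold the phase, then remove the known basis bit."""
    return [(1 if math.cos(y) < 0.0 else 0) ^ b for y, b in zip(signals, key)]

rng = random.Random(11)                            # stand-in for both PhRGs
n = 256
current = [rng.getrandbits(1) for _ in range(n)]   # K0, shared beforehand
shared = []                                        # K1, K2, K3, ... as they accumulate

for cycle in range(1, 5):                          # A -> B, B -> A, A -> B, B -> A
    fresh = [rng.getrandbits(1) for _ in range(n)]
    signals = send(fresh, current, rng)
    decoded = receive(signals, current)
    assert decoded == fresh                        # both parties now hold K_cycle
    shared.append(decoded)
    current = decoded                              # K_cycle becomes the next basis key

print(len(shared) * n, "fresh bits from", n, "initial bits")
```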

A last caveat should be made. Although the key sharing seems able to proceed without bound, physical properties impose some constraints and length limitations. Besides these limitations, the key sequences shared should pass key reconciliation and privacy amplification steps [5] to establish security bounds against all possible attacks by E. The length limitation arises from the physical constraints discussed as follows.

A and B use PhRGs to generate physical signals creating the random bits that define the key sequences $K$ and the continuous noise $n$ necessary for the protocol. Being physical signals, precise variables have to be discussed and the noise source well characterized. Interfaces will transform the physical signals into binary sequences adequate for Internet transmission protocols. Optical noise sources can be chosen for fast speeds. PhRGs have been discussed in the literature and even commercial ones are now becoming available. Without going into details, one could divide the PhRG into two parts, one generating random binary signals and another providing noise in a continuous physical variable (e.g., the phase of a light field). These two signals are detected, adequately formatted and can be added.

Taking the phase of a light field as the physical variable of interest, one could assume laser light in a coherent state with average number of photons $\langle n\rangle$ within one coherence time ($\langle n\rangle = |\alpha|^2 \gg 1$) and phase $\phi$. Phases $\phi = 0$ could define the bit 0 while $\phi = \pi$ could define the bit 1. It can be shown (see also ahead) that two non-orthogonal states with phases $\phi_1$ and $\phi_2$ ($\Delta\phi_{12} = |\phi_1 - \phi_2| \to 0$ and $\langle n\rangle \gg 1$) overlap with (unnormalized) probability

$$p_u \simeq e^{-(\Delta\phi_{12})^2/2\sigma_\phi^2} , \qquad (2)$$

where $\sigma_\phi = \sqrt{2/\langle n\rangle}$ is the standard deviation measure for the phase fluctuations $\Delta\phi$. For distinguishable states, $p_u \to 0$ (no overlap) and for maximum indistinguishability $p_u = 1$ (maximum overlap). With adequate formatting, $\phi_1 - \phi_2$ gives the spacing $\epsilon$ ($\Delta\phi_{12} = \epsilon$) already introduced. Eq. (2) with $\Delta\phi_{12}$ replaced by $\Delta\phi$ describes the probability for generic phase fluctuations $\Delta\phi$ in a coherent state of constant amplitude ($|\alpha| = \sqrt{\langle n\rangle} =$ constant) but with phase fluctuations.
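
The overlap probability can be evaluated directly. The sketch below takes the Gaussian form $e^{-\Delta\phi^2/2\sigma_\phi^2}$ with $\sigma_\phi = \sqrt{2/\langle n\rangle}$; the function names, the photon number and the 12-bit phase spacing are illustrative assumptions.

```python
import math

def sigma_phi(mean_photons):
    """Phase standard deviation sqrt(2 / <n>) of the coherent state."""
    return math.sqrt(2.0 / mean_photons)

def overlap_probability(delta_phi, mean_photons):
    """Unnormalized overlap exp(-delta_phi^2 / (2 sigma_phi^2))."""
    s = sigma_phi(mean_photons)
    return math.exp(-(delta_phi ** 2) / (2.0 * s ** 2))

mean_photons = 1.0e4                    # illustrative intensity
close = 2 * math.pi / 2**12             # spacing of a hypothetical 12-bit converter
print(overlap_probability(close, mean_photons))     # close to 1
print(overlap_probability(math.pi, mean_photons))   # essentially 0
```

Two phases separated by the small spacing overlap almost completely, while the two bits of a single encoding, separated by $\pi$, do not overlap at all for practical purposes.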

The laser light intensity is adjusted by A (or B) such that $\sigma_\phi \gg \Delta\phi$. This guarantees that the recorded information in the files to be sent over the open channel is such that the recorded light noise makes the two close levels $\phi_1$ and $\phi_2$ indistinguishable to the attacker. To prevent the legitimate user from confusing 0s and 1s in a single basis, the light fluctuation should obey $\sigma_\phi \ll \pi/2$. These conditions can be summarized as

$$\frac{\pi}{2} \gg \sigma_\phi \gg \Delta\phi . \qquad (3)$$

This shows that this key distribution system depends fundamentally on physical aspects for security and not just on mathematical complexity.
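
A quick way to see how the two inequalities constrain the light intensity is to test a few operating points. In this sketch "$\gg$" is read, arbitrarily, as "larger by at least a factor `margin`"; that factor, the function name and the sample values are assumptions, and $\sigma_\phi$ is taken as $\sqrt{2/\langle n\rangle}$.

```python
import math

def operating_point(mean_photons, delta_phi, margin=10.0):
    """Check pi/2 >> sigma_phi >> delta_phi, reading '>>' as 'at least margin times'."""
    sigma = math.sqrt(2.0 / mean_photons)
    return {
        "sigma_phi": sigma,
        "legitimate_user_ok": math.pi / 2 >= margin * sigma,
        "attacker_blinded": sigma >= margin * delta_phi,
    }

delta_phi = 2 * math.pi / 2**12         # hypothetical 12-bit phase spacing
for mean_photons in (1.0e1, 1.0e3, 1.0e6):
    print(mean_photons, operating_point(mean_photons, delta_phi))
```

Too little light makes the bits of a single basis hard to separate for the legitimate user; too much light shrinks the noise below the basis spacing and exposes the encoding.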

The separation between bits in the same encoding is easily achieved under the condition $\pi/2 \gg \sqrt{2/\langle n\rangle}$. The condition $\sqrt{2/\langle n\rangle} \gg \Delta\phi$ implies that the members of the set formed by bit 0 in encoding 0 and bit 1 in encoding 1 (set 1) cannot be easily told apart, and the same happens with the set formed by bit 1 in encoding 0 and bit 0 in encoding 1 (set 2). Therefore, for A, B and E, there is no difficulty in identifying that a sent signal is in set 1 or 2. However, E does not know the encoding provided to A or B by their shared knowledge of the basis used. The question "What is the attacker's probability of error in bit identification without repeating a sent signal?" has a general answer using information theory applied to a binary identification of two states: the average probability of error in identifying two states $|\psi_0\rangle$ and $|\psi_1\rangle$ is given by the Helstrom bound

$$P_e = \frac{1}{2}\left[1 - \sqrt{1 - |\langle\psi_0|\psi_1\rangle|^2}\,\right] . \qquad (4)$$

Here $|\psi_0\rangle$ and $|\psi_1\rangle$ are coherent states of light with the same amplitude but distinct phases



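$$|\psi\rangle = |\alpha\rangle = \big|\,|\alpha| e^{-i\phi}\big\rangle , \qquad (5)$$

defined at the PhRG. $|\psi_0\rangle$ defines states in encoding 0, where bits 0 and 1 are given by

$$|\psi_0\rangle = \begin{cases} |\alpha\rangle , & \text{for bit 0, and} \\ |-\alpha\rangle , & \text{for bit 1,} \end{cases} \qquad (6)$$

$|\psi_1\rangle$ defines states in encoding 1, where bits 1 and 0 are given by

$$|\psi_1\rangle = \begin{cases} \big|\,|\alpha| e^{-i\Delta\phi/2}\big\rangle , & \text{for bit 1, and} \\ \big|\,|\alpha| e^{-i(\Delta\phi/2+\pi)}\big\rangle , & \text{for bit 0,} \end{cases} \qquad (7)$$

where $|\phi_0 - \phi_1| = \Delta\phi$. $|\langle\psi_0|\psi_1\rangle|^2$ is calculated in a straightforward way and gives

$$|\langle\psi_0|\psi_1\rangle|^2 = e^{-2\langle n\rangle\left[1-\cos(\Delta\phi/2)\right]} . \qquad (8)$$

For $\langle n\rangle \gg 1$ and $\Delta\phi \ll 1$,

$$|\langle\psi_0|\psi_1\rangle|^2 \simeq e^{-\langle n\rangle\Delta\phi^2/4} = e^{-\Delta\phi^2/2\sigma_\phi^2} , \qquad (9)$$

where $\sigma_\phi = \sqrt{2/\langle n\rangle}$ is the irreducible standard deviation for the phase fluctuation associated with the laser field.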
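
The small-angle step can be checked numerically. The sketch assumes that the squared overlap has the form $e^{-2\langle n\rangle[1-\cos(\Delta\phi/2)]}$ and that $\sigma_\phi^2 = 2/\langle n\rangle$, so that the Gaussian form is $e^{-\langle n\rangle\Delta\phi^2/4}$; the function names and sample values are illustrative.

```python
import math

def overlap_exact(mean_photons, delta_phi):
    """Assumed exact form: exp(-2 <n> [1 - cos(delta_phi / 2)])."""
    return math.exp(-2.0 * mean_photons * (1.0 - math.cos(delta_phi / 2.0)))

def overlap_gaussian(mean_photons, delta_phi):
    """Small-angle form: exp(-delta_phi^2 / (2 sigma_phi^2)) with sigma_phi^2 = 2 / <n>."""
    return math.exp(-delta_phi ** 2 * mean_photons / 4.0)

for delta_phi in (1.0e-3, 1.0e-2, 1.0e-1):
    exact = overlap_exact(1.0e4, delta_phi)
    approx = overlap_gaussian(1.0e4, delta_phi)
    print(delta_phi, exact, approx)
```

The two expressions agree closely for small $\Delta\phi$, which is the regime relevant to closely spaced bases.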

One should recall that in the proposed system the measuring procedure is defined by the users A and B, and no attack launched by E can improve the deterministic signals that were already available to him (or her). Thus, the noise frustrating the attacker's success cannot be eliminated or diminished by measurement techniques.

One should observe that each random bit defining the key sequence is first sent as a message by A (or B) and then resent as a key (encoding information) from B (or A) to A (or B). In both emissions, noise is superposed on the signals. In general, coherent signal repetition implies that a better resolution may be achieved, proportional to the number of repetitions $r$. This improvement in resolution is equivalent to a single measurement with a signal $r$ times more intense. To correct for this single repetition, $\langle n\rangle$ is replaced by $2\langle n\rangle$ in $|\langle\psi_0|\psi_1\rangle|^2$. The final
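probability of error results

$$P_e = \frac{1}{2}\left[1 - \sqrt{1 - e^{-4\langle n\rangle\left[1-\cos(\Delta\phi/2)\right]}}\,\right] . \qquad (10)$$
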
FIG. 1: $\Delta H_{\text{bases}}$ as a function of $\langle n\rangle$ and $\Delta\phi$.
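
The attacker's error probability can be tabulated with a few lines of code. The sketch assumes the Helstrom bound for two equally likely pure states, $P_e = \frac{1}{2}[1 - \sqrt{1 - |\langle\psi_0|\psi_1\rangle|^2}]$, and a squared overlap of the form $e^{-2\langle n\rangle[1-\cos(\Delta\phi/2)]}$ with $\langle n\rangle$ doubled to account for the two emissions of each bit. The function names, the `emissions` parameter and the sample values are hypothetical.

```python
import math

def helstrom_error(overlap_sq):
    """Minimum average error for two equally likely pure states."""
    return 0.5 * (1.0 - math.sqrt(1.0 - overlap_sq))

def attacker_error(mean_photons, delta_phi, emissions=2):
    """Error probability with <n> replaced by emissions * <n> in the overlap."""
    exponent = 2.0 * emissions * mean_photons * (1.0 - math.cos(delta_phi / 2.0))
    return helstrom_error(math.exp(-exponent))

delta_phi = 2 * math.pi / 2**12     # hypothetical 12-bit phase spacing
for mean_photons in (1.0e2, 1.0e3, 1.0e4):
    print(mean_photons, attacker_error(mean_photons, delta_phi))
```

An error probability of 1/2 corresponds to pure guessing; brighter light or a wider basis spacing pulls the value below 1/2.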

This error probability can be used to derive some of the proposed system's limitations. The attacker's probability of success $P_s$ ($= 1 - P_e$) in obtaining the basis used in a single emission may be compared with the a-priori starting entropy $H_{\text{bases,bit}}$ of the two bases that carry one bit of the message to be sent (a random bit). If the attacker knows the basis, the bit will also be known, with the same probability $\to 1$ as for the legitimate user.

$$H_{\text{bases,bit}} = -p_0 \log p_0 - p_1 \log p_1 = 1 , \qquad (11)$$

where $p_0$ and $p_1$ are the a-priori probabilities for each basis, $p_0 = p_1 = 1/2$, as defined by the PhRG. The entropy defined by success events is $H_s = -P_s \log P_s$. The entropy variation $\Delta H = H_{\text{bases,bit}} - H_s$, statistically obtained or leaked from bit measurements, shows the statistical information acquired by the attacker with respect to the a-priori starting entropy:

$$\Delta H_{\text{bases}} = \left(H_{\text{bases,bit}} - H_s\right) . \qquad (12)$$

Fig. 1 shows $\Delta H_{\text{bases}}$ for some values of $\langle n\rangle$ and $\Delta\phi$. The value $\Delta H_{\text{bases}} = 1/2$ is the limiting case where the two bases cannot be distinguished. Deviations of $\Delta H_{\text{bases}}$ from this limiting value of 1/2 indicate that some amount of information on the basis used may potentially be leaking to the attacker. It is clear that the attacker cannot obtain the basis in a bit-by-bit process. For the attacker to obtain statistically a good amount of information on a single encoding used, the number $L$ of bits exchanged should satisfy

$$L \times \left(\Delta H_{\text{bases}} - \frac{1}{2}\right) \gg 1 . \qquad (13)$$

Fig. 2 shows estimates for $L$ for a range of values of $\langle n\rangle$ and $\Delta\phi$ satisfying $L \times (\Delta H_{\text{bases}} - \frac{1}{2}) = 1$ ($\Delta\phi$ is given in powers of 2, indicating the bit resolution of analog-to-digital converters).

FIG. 2: Estimates for the minimum length of bits $L$ exchanged between A and B that could give one bit of information about the bases used to the attacker.

It is assumed that error correction codes can correct for technical errors in the transmission/reception steps for the legitimate users. The leak estimate given by Eq. (13) does not imply that the information has actually leaked to the attacker. However, for security reasons, one takes for granted that this deviation indicates a statistical fraction of bits acquired by the attacker.
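
The leak estimate can be turned into a rough length bound with the sketch below. It assumes base-2 logarithms, $H_{\text{bases,bit}} = 1$, $H_s = -P_s \log_2 P_s$, and an attacker error given by the Helstrom bound with a squared overlap $e^{-4\langle n\rangle[1-\cos(\Delta\phi/2)]}$; the function names and sample values are hypothetical, and the printed numbers are not meant to reproduce the figures.

```python
import math

def attacker_error(mean_photons, delta_phi):
    """Assumed Helstrom error with <n> doubled for the two emissions of each bit."""
    overlap_sq = math.exp(-4.0 * mean_photons * (1.0 - math.cos(delta_phi / 2.0)))
    return 0.5 * (1.0 - math.sqrt(1.0 - overlap_sq))

def delta_h_bases(p_error):
    """Delta H = H_bases,bit - H_s with H_bases,bit = 1 and H_s = -P_s log2 P_s."""
    p_success = 1.0 - p_error
    return 1.0 + p_success * math.log2(p_success)

def min_length(mean_photons, delta_phi):
    """Smallest L with L * (Delta H - 1/2) = 1; infinite if nothing leaks."""
    excess = delta_h_bases(attacker_error(mean_photons, delta_phi)) - 0.5
    return math.inf if excess <= 0.0 else 1.0 / excess

for bits in (8, 12, 16):                 # hypothetical converter resolutions
    delta_phi = 2 * math.pi / 2**bits
    print(bits, min_length(1.0e3, delta_phi))
```

Finer phase spacing (more converter bits) lengthens the sequence that can be exchanged before one bit of basis information is statistically exposed.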

Privacy amplification procedures can be applied to the shared bits in order to reduce this hypothetical information gained by the attacker to negligible levels. These procedures are beyond the scope of the present discussion, but one can easily accept that A and B may discard a similar fraction of bits to statistically reduce the amount of information potentially leaked. Discarding this fraction of bits after each succession of bits is exchanged between A and B implies, e.g., that the number of bits to be exchanged will decrease at every emission. Eventually, a new shared key $K_0$ has to start the process again to keep the system secure. Nevertheless, the starting key length $K_0$ was boosted in a secure way. Without further procedures, the physical noise allowed $K \gg 10^3 K_0$, a substantial improvement over the classical one-time pad factor of 1. One may still argue that the ultimate security relies on the length of $K_0$, because if $K_0$ is known no secret will remain hidden from the attacker. This is true but does not invalidate the practical aspect of the system, because $K_0$ can be made sufficiently long to frustrate any brute-force attack at any stage of technology. Therefore, the combination of physical noise and complexity makes this noisy one-time pad practical for Internet uses.

Although the security of the process has been demonstrated, one should also point to a fragility of the system (without a privacy amplification stage) that has to be avoided when A and B are encrypting messages $X$ between them. As was shown, knowledge of one sequence of random bits leads to knowledge of the following sequence. This makes the system vulnerable to known-plaintext attacks in the following way: E has a perfect record of both sequences $Y_1$ and $Y_2$ and tries to recover any bit sequence from them, $K_2$, $K_1$ or $K_0$. E will wait until A and B use these sequences for encryption before trying to break the system. A and B will encrypt a message using a new shared sequence, $K_1$ or $K_2$. This message could be a plaintext, say $X = x_1, x_2, \ldots, x_k$, known to the attacker. Encrypting this message with, say, $K_1$ in a noiseless way gives $Y = x_1 \oplus k_1^{(1)}, x_2 \oplus k_2^{(1)}, \ldots, x_k \oplus k_k^{(1)}$. Performing the operation $Y \oplus X$, E obtains $K_1$. The chain dependence of $K_j$ on $K_{j-1}$ creates this fragility. Even the addition of noise to the encrypted file does not eliminate this fragility, because the attacker can use his or her knowledge of $X$ as the key to obtain $K$ as a message. The situation is symmetric between B and the attacker: the one who knows the key ($X$ for E, and $K$ for B) obtains the desired message ($K$ for E, and $X$ for B).
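
The attack can be reproduced in a toy setting. The phase model for the recorded signal, the helper names and the numerical values are assumptions for illustration; only the XOR step is taken directly from the description above.

```python
import math
import random

def xor(a, b):
    return [x ^ y for x, y in zip(a, b)]

def decode(signals, key):
    """Hypothetical receiver rule: threshold the phase, then remove the basis bit."""
    return [(1 if math.cos(y) < 0.0 else 0) ^ b for y, b in zip(signals, key)]

rng = random.Random(5)                       # stand-in for the PhRGs
n = 64
K1 = [rng.getrandbits(1) for _ in range(n)]
K2 = [rng.getrandbits(1) for _ in range(n)]

# Recorded by E from the open channel: K2 sent under basis key K1, with noise.
Y2 = [math.pi * (k ^ b) + 1.5e-3 * b + rng.gauss(0.0, 0.1) for k, b in zip(K2, K1)]

# A and B encrypt a message that E happens to know, using K1 without noise.
X = [rng.getrandbits(1) for _ in range(n)]
Y = xor(X, K1)

# E's attack: the known plaintext acts as the key and K1 is read as the message,
K1_leaked = xor(Y, X)
# and the recorded Y2 then yields the next sequence as well.
K2_leaked = decode(Y2, K1_leaked)
print(K1_leaked == K1, K2_leaked == K2)
```

Once one sequence in the chain is exposed, every later sequence recorded from the open channel follows, which is why known plaintexts must be avoided.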

In general, random generation processes are attractive to attackers and have to be carefully controlled. Well identifiable physical components (e.g., the PhRG) are usually a target for attackers, who may try to substitute a true random sequence with pseudo-random bits generated from a seed key under their control. Electronic components can also be inserted to perform this task, replacing the original generator; electric or electromagnetic signals may induce sequences chosen by the attacker, and so on. In the same way, known-plaintext attacks also have to be carefully avoided by the legitimate users. The possibility of further privacy amplification procedures to eliminate the known-plaintext attack presented is beyond the scope of this work.

Many protocols that use secret key sharing may profit from this one-time pad booster system. For example, besides data encryption, authentication procedures can be carried out by hashing message files with sequences of shared secret random bits. Challenge handshaking may allow a user to prove his or her identity to a second user across an insecure network.
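
A challenge handshake of this kind could be sketched as follows. No specific hash construction is named in this work; HMAC-SHA256 from the Python standard library is used here purely as a familiar stand-in, and its security is computational rather than unconditional. The function names and the placeholder key are hypothetical.

```python
import hashlib
import hmac
import secrets

def make_tag(shared_bits: bytes, message: bytes) -> bytes:
    """Keyed hash of the message under shared secret bits (HMAC-SHA256 as a stand-in)."""
    return hmac.new(shared_bits, message, hashlib.sha256).digest()

def verify_tag(shared_bits: bytes, message: bytes, tag: bytes) -> bool:
    """Constant-time comparison of the expected and received tags."""
    return hmac.compare_digest(make_tag(shared_bits, message), tag)

# Hypothetical handshake: B challenges A, and A answers with a keyed hash.
shared_bits = bytes(range(32))           # placeholder for bits taken from a shared K_j
challenge = secrets.token_bytes(16)      # sent in the clear by B
response = make_tag(shared_bits, challenge)          # computed by A
print(verify_tag(shared_bits, challenge, response))  # B accepts
print(verify_tag(bytes(32), challenge, response))    # a party without the bits is rejected
```

In keeping with the one-time character of the keys, the shared bits consumed by a handshake would not be reused for encryption.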

In conclusion, it has been shown that Internet users will succeed in generating and sharing, in a fast way, a large number of secret keys to be used in one-time pad encryption as described. They have to start from a shared secret sequence of random bits obtained from a physical random generator hooked to their computers. The physical noise in the signals openly transmitted is set to hide the random bits. No intrusion detection method is necessary. Privacy amplification protocols eliminate any fraction of information that may eventually have been obtained by the attacker. As the security is not based only on mathematical complexity but depends on physical noise, technological advances will not harm this system. This is very different from systems that rely entirely on, say, the difficulty of factoring large numbers into their prime factors. It was thus shown that, by sharing secure secret key sequences, one-time pad encryption over the Internet can be practically implemented.

*E-mail: GeraldoABarbosa@hotmail.com 